Search Engine-Crawler Symbiosis: Adapting to Community Interests
Authors
Abstract
Web crawlers have been used for nearly a decade as a search engine component to create and update large collections of documents. Typically the crawler and the rest of the search engine are not closely integrated. If the purpose of a search engine is to have as large a collection as possible to serve the general Web community, a close integration may not be necessary. However, if the search engine caters to a specific community with shared, focused interests, it can take advantage of such an integration. In this paper we investigate a tightly coupled system in which the crawler and the search engine engage in a symbiotic relationship: the crawler feeds the search engine, and the search engine in turn helps the crawler improve its performance. We show that this symbiosis can help the system learn a community's interests and serve that community with better focus.
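One way to picture the feedback loop described in the abstract is a best-first crawler whose frontier is re-scored with a term profile supplied by the search engine. The sketch below is a minimal illustration under that assumption; the `score` function, the `term_weights` feed, and the link-extraction details are hypothetical and not the authors' implementation.

```python
# Minimal sketch of a crawler/search-engine feedback loop (hypothetical,
# not the paper's implementation): the search engine hands the crawler a
# term profile built from community queries, and the crawler uses it to
# prioritize its frontier.
import heapq
import re
from urllib.parse import urljoin
from urllib.request import urlopen

def score(text, term_weights):
    """Weight a page's text by how strongly it matches the community's terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(term_weights.get(t, 0.0) for t in tokens)

def crawl(seeds, term_weights, max_pages=50):
    frontier = [(-1.0, url) for url in seeds]   # max-heap via negated priority
    heapq.heapify(frontier)
    seen, collected = set(seeds), []
    while frontier and len(collected) < max_pages:
        _, url = heapq.heappop(frontier)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue
        collected.append((url, html))           # pages handed to the index
        for link in re.findall(r'href="([^"]+)"', html):
            target = urljoin(url, link)
            if target.startswith("http") and target not in seen:
                seen.add(target)
                # Priority of a new URL reflects the community's interests,
                # here approximated by the relevance of the page linking to it.
                heapq.heappush(frontier, (-score(html, term_weights), target))
    return collected
```

In the paper's framing, the pages in `collected` would be indexed by the search engine, while `term_weights` would be periodically rebuilt from the community's query log, closing the loop.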
Similar Resources
Search Engine-Crawler Symbiosis
Web crawlers have been used for nearly a decade as a search engine component to create and update large collections of documents. Typically the crawler and the rest of the search engine are not closely integrated. If the purpose of a search engine is to have as large a collection as possible to serve the general Web community, a close integration may not be necessary. However, if the search eng...
DHT-Based Distributed Crawler
A search engine, like Google, is built using two pieces of infrastructure: a crawler that indexes the web and a searcher that uses the index to answer user queries. While Google's crawler has worked well, there are issues of timeliness and of the limited control given to end-users to direct the crawl according to their interests. The interface presented by such search engines is hence very limite...
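The entry above relies on partitioning the crawl across peers with a distributed hash table. As a rough illustration of that idea (not the paper's actual protocol), the sketch below hashes each URL's hostname onto a consistent-hash ring so that any peer can decide locally which crawler node owns a newly discovered URL; the node names are made up.

```python
# Rough illustration of DHT-style crawl partitioning (hypothetical, not the
# paper's protocol): hostnames are hashed onto a ring, and each crawler node
# owns the arc ending at its position, so URL ownership is computed locally.
import bisect
import hashlib
from urllib.parse import urlparse

def _h(key: str) -> int:
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class CrawlRing:
    def __init__(self, node_ids):
        self._ring = sorted((_h(n), n) for n in node_ids)
        self._points = [p for p, _ in self._ring]

    def owner(self, url: str) -> str:
        """Return the node responsible for crawling this URL's host."""
        point = _h(urlparse(url).hostname or url)
        i = bisect.bisect(self._points, point) % len(self._ring)
        return self._ring[i][1]

ring = CrawlRing(["node-a", "node-b", "node-c"])
print(ring.owner("http://example.org/page"))   # prints whichever node owns example.org
```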
A Grid Focused Community Crawling Architecture for Medical Information Retrieval Services
This paper describes a GRID-focused community crawling architecture and its possible adoption in a medical information domain. The architecture has been designed to provide an information retrieval service to individuals who are entitled to access the highly distributed computational power of the GRID, eliminating the need for a central authority/repository such as a single search engine. In ...
Designing and Implementation of "regional Crawler" as a New Strategy for Crawling the Web
With the rapid growth of the World Wide Web, the significance and popularity of search engines are increasing day by day. However, today's web crawlers are unable to update their search engine indexes as fast as the information available on the web grows, which sometimes leaves users unable to find recent or updated information. The Regional Crawler that we are proposing in this pa...
Web Crawler: Extracting the Web Data
Internet usage has increased substantially in recent times. Users find resources by following hypertext links, and this use of the Internet has led to the development of web crawlers. Web crawlers are full-text search engine components that assist users in navigating the web. These crawlers can also be used in further research activities. For example, the crawled data can be used to find missing links, ...
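The entry above mentions using crawled data to find missing links. As a small, hypothetical example of that idea (not the paper's tool), the sketch below scans stored pages for outgoing links and reports those whose targets no longer resolve.

```python
# Small sketch of one use of crawled data (hypothetical example): scan stored
# pages for outgoing links and report those whose targets return an error.
import re
from urllib.parse import urljoin
from urllib.request import urlopen
from urllib.error import URLError

def find_missing_links(pages):
    """pages: iterable of (url, html) pairs produced by a crawl."""
    missing = []
    for base_url, html in pages:
        for href in re.findall(r'href="([^"]+)"', html):
            target = urljoin(base_url, href)
            if not target.startswith("http"):
                continue
            try:
                urlopen(target, timeout=5)
            except (URLError, OSError):
                missing.append((base_url, target))   # link target unreachable
    return missing
```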